Improving Retrievability of Patents in Prior-Art Search

Authors

  • Shariq Bashir
  • Andreas Rauber
Abstract

Prior-art search is an important task in patent retrieval, and its success depends on selecting relevant search queries. Typically, terms for prior-art queries are extracted from the claim fields of query patents. However, because of the complex technical structure of patents, term mismatch, and vague terms, selecting relevant query terms is difficult. When evaluating the retrievability coverage of prior-art queries generated from query patents, we observe a large bias toward a subset of the collection: a large number of patents either have a very low retrievability score or cannot be discovered via any query. To increase the retrievability of patents, in this paper we expand prior-art queries generated from query patents using query expansion with pseudo-relevance feedback. Missing terms from query patents are discovered from feedback patents, and better patents for relevance feedback are identified using a novel approach for checking their similarity with query patents. We focus specifically on how to automatically select better terms from query patents, based on their proximity distribution with prior-art queries, for use as features in computing similarity. Our results show that the coverage of prior-art queries can be increased significantly by incorporating relevant query terms using query expansion.
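
The retrievability bias described in the abstract can be illustrated with a minimal sketch. It assumes the common cumulative definition of retrievability, in which a document's score is the number of queries that rank it within a cutoff; the abstract does not say which variant the paper uses. The function name `retrievability`, the toy rankings, and the patent identifiers are all hypothetical.

```python
from collections import Counter


def retrievability(rankings, cutoff):
    """Count, for each document, how many queries rank it within the top `cutoff`.

    `rankings` maps a query id to its ranked list of document ids.
    This is an assumed cumulative scoring rule, not necessarily the paper's.
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


# Hypothetical toy data: three prior-art queries over a six-patent collection.
rankings = {
    "q1": ["p1", "p2", "p3"],
    "q2": ["p1", "p3", "p4"],
    "q3": ["p2", "p1", "p5"],
}
collection = ["p1", "p2", "p3", "p4", "p5", "p6"]

scores = retrievability(rankings, cutoff=2)

# Patents that no query surfaces within the cutoff.
unreachable = [doc for doc in collection if scores[doc] == 0]
```

In this toy collection a few patents absorb most of the retrieval opportunities while others are never reached within the cutoff, which is the kind of skew that query expansion aims to reduce.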


Similar resources

The Study of Patent Prior Art Retrieval Using Claim Structure and Link Analysis

Prior art retrieval plays an important role in patent examination. If a patent examiner can quickly and accurately retrieve prior art for an application patent, s/he will be able to efficiently and effectively judge the novelty of an application patent, and in turn, avoid hampering the technology development of the application domain. Moreover, in order to avoid losing tangible and intangible a...


Automatic Learning of A Supervised Classifier for Patent Prior Art Retrieval

Prior art retrieval is the process of determining a set of possibly relevant prior arts for a specific patent or patent application. Such process is essential for various patent practices, e.g. patentability search, validity search, and infringement search. To support the automatic retrieval of prior arts, existing studies generally adopt the traditional information retrieval (IR) approach or e...


Analyzing Document Retrievability in Patent Retrieval Settings

Most information retrieval settings, such as web search, are typically precision-oriented, i.e. they focus on retrieving a small number of highly relevant documents. However, in specific domains, such as patent retrieval or law, recall becomes more relevant than precision: in these cases the goal is to find all relevant documents, requiring algorithms to be tuned more towards recall at the cost...


Improving Retrievability and Recall by Automatic Corpus Partitioning

With increasing volumes of data, much effort has been devoted to finding the most suitable answer to an information need. However, in many domains, the question whether any specific information item can be found at all via a reasonable set of queries is essential. This concept of Retrievability of information has evolved into an important evaluation measure of IR systems in recall-oriented appl...


Prior Art Search in Chemistry Patents Based On Semantic Concepts and Co-Citation Analysis

Prior Art Search is a task of querying and retrieving the patents in order to uncover any knowledge existing prior to the inventor’s question or invention at hand. For addressing this task, we present a contemporary approach that has been evaluated during Trecchem for its ability to adapt to text containing chemistry-based information. The core of the framework is an index of 1.3 million chemis...




Publication date: 2010